Day 4 - Introduction to Data Analysis with R
Freie Universität Berlin - Theoretical Ecology
February 28, 2025
Schedule of today
Now - 14 (or 14.30 if you are still enthusiastic): Work on the data set(s)
14 (14.30) - 15: Short feedback round
15-16: Feedback, conclusion
Physicochemical properties of wine and quality judgements
Data from the EcoData package
dplyr
Use dplyr::mutate and as.factor() to transform the column
Use the janitor::clean_names() function
Most important variables:
| variable | class | description |
|---|---|---|
| gender | character | Binary gender |
| event | character | Event name |
| medal | character | Medal type |
| athlete | character | Athlete name (LAST NAME first name) |
| abb | character | Country abbreviation |
| country | character | Country name |
| type | character | Type of sport |
| year | double | year of games |
Get the data:
dplyr
Keep only rows with a medal: !is.na(medal)
Atlantic marsh fiddler crab (Minuca pugnax)
Data: pie_crab from the lterdatasampler package
Ideas - known methods
Temperature and ice duration on lakes since 19th century
Ice data: ntl_icecover from the lterdatasampler package
Temperature data: ntl_airtemp from the lterdatasampler package
Ideas - known methods
Use dplyr::left_join to combine the tables with annual mean temperature and ice duration
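A minimal sketch of this join, using small made-up tables rather than the real ice and temperature data. The column names (`year`, `ice_duration`, `air_temp`) and all values are assumptions for illustration only; check the actual column names in your data before adapting it.

```r
library(dplyr)

# Toy stand-ins for the two tables (hypothetical columns and values)
ice <- tibble(
  year = c(2000, 2001, 2002),
  ice_duration = c(95, 88, 102)
)
temp_daily <- tibble(
  year = c(2000, 2000, 2001, 2001, 2002, 2002),
  air_temp = c(4, 6, 7, 9, 3, 5)
)

# Reduce the temperature table to one row per year
temp_annual <- temp_daily |>
  group_by(year) |>
  summarise(mean_temp = mean(air_temp, na.rm = TRUE))

# Keep every row of the ice table and attach the matching annual mean
ice_temp <- left_join(ice, temp_annual, by = "year")
```

Because `left_join` keeps all rows of the first table, years without a temperature record would still appear, with `NA` in `mean_temp`.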
dplyr session or look at the help
Data from Fu et al. 2015, Nature Cell Biology
Data found via a tutorial on heat maps that uses this data
3 csv files:
heatmap_genes.csv: A list of the names of interesting genes to look at (genes used in Figure 6b in the paper)
DE_results.csv: Gene expression in luminal cells in pregnant versus lactating mice
normalized_counts: Normalized counts for genes for the different samples
Data cleaning:
Use the janitor::clean_names function to make the column headers nicer
Join DE_results and normalized_counts by their shared columns
Use select to remove columns you don’t need for analysis to get a better overview
Filter the genes (p_value < 0.01 & abs(logFC) > 0.58)
Data analysis:
pheatmap::pheatmap()
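A heat map function such as pheatmap works on a numeric matrix, and expression values are usually scaled per gene so that genes with very different count levels become comparable. The sketch below shows one way to do both steps in base R. The data frame, its column names, and its values are made up for illustration and are not the course data.

```r
# Toy table: one row per gene, one column per sample (hypothetical values)
counts <- data.frame(
  gene = c("gene_a", "gene_b", "gene_c"),
  sample_1 = c(10, 200, 5),
  sample_2 = c(20, 400, 5.5),
  sample_3 = c(30, 600, 6)
)

# Drop the gene column, convert to a numeric matrix,
# and keep the gene names as row names
count_matrix <- as.matrix(counts[, -1])
rownames(count_matrix) <- counts$gene

# scale() standardises columns, so transpose first to scale each gene (row),
# then transpose back to genes in rows and samples in columns
scaled_matrix <- t(scale(t(count_matrix)))
```

`scaled_matrix` could then be passed to `pheatmap::pheatmap()`. Alternatively, pass the unscaled matrix with `scale = "row"` and let pheatmap do the scaling.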
pheatmap takes a matrix as input (use as.matrix on the tibble to transform it)
scale function
pheatmap can scale, but with ggplot you have to scale before plotting
corrplot package for correlation plots
factoextra package for PCA visualization
NA values: use tidyr::drop_na() to remove all NA values from the data first
Working with real research data
Meet in your group (if you want)
Work on your data set
Take breaks as you need and be back at 2 p.m.
Keep an eye on your group and the general chat
In 1-2 mins:
What was the highlight of your analysis?
What was difficult?
If you want: Share a screenshot in the chat or share your screen
Please take 10 mins to complete the feedback survey for the Graduate center (don’t use Internet Explorer)
We learned a lot of stuff!
Selina Baldauf // Bring your own data